Abstract. Non-equilibrium systems have long-ranged spatial correlations even far
away from critical points. This implies that the likelihoods of spatial steady state 
profiles of physical observables are nonlocal functionals. In this letter, it is shown 
that these properties are essential to a successful analysis of a functional level inverse 
problem, in which a macroscopic non-equilibrium fluctuation field is estimated from 
limited but spatially scattered information. To exemplify this, we dilute an
out-of-equilibrium fluid flowing through random media with a marker, which can be observed
in an experiment. We see that the hidden variables describing the random environment 
result in spatial long-range correlations in the marker signal. Two types of statistical 
estimators for the structure of the underlying media are then constructed: a linear 
estimator provides unbiased and asymptotically precise information on the particle 
density profiles, but yields negative estimates for the effective resistances of the media 
in some cases. A nonlinear, maximum likelihood estimator, on the other hand, results 
in a faithful media structure, but has a small bias. These two approaches complement 
each other. Finally, estimation of non-equilibrium fluctuation fields evolving in time is 
discussed. 

Keywords: transport processes / heat transfer (theory), stationary states, 
disordered systems (theory), new applications of statistical mechanics 
1. Introduction

Interacting non-equilibrium systems exhibit spatial long-range correlations, in which 
the equal-time autocorrelation functions of physical observables show slower than 
exponential decay as a function of the distance between points of measurement. 
The presence of conservation laws, spatial anisotropies and a lack of detailed balance are
typical requirements for their emergence [1]. In some fluids, the coupling between
hydrodynamic fields is the dominant cause of the correlations [2]. In contrast to
thermodynamic equilibrium, proximity of a phase boundary is not needed.

The long-range correlations show up in the fluctuations of observables on length- 
scales comparable to the system size. The statistics of these fluctuations are captured 
by the likelihoods of macroscopic profiles, which often satisfy a large deviation principle. 
In those cases, the likelihood is characterized by a large deviation functional, a 
generalization of the equilibrium concept of free energy, which measures the macroscopic 
fluctuations in terms of exponentially small probabilities [3]. As an example, the
closeness of a stationary but fluctuating fluid density field to an arbitrary function $f$
in a volume $V$ could be described by a large deviation functional $\mathcal{F}$ as

$$\mathrm{P}(\text{density profile} \approx f) \asymp \exp\left(-V \mathcal{F}[f]\right), \tag{1}$$

with $\mathcal{F}$ attaining its minimum at the most likely profile.

It has been shown that the non-equilibrium large deviation functionals are nonlocal, 
in that the likelihood of a profile is not additive under the operation of uniting two 
subsystems into a larger, joint system [4]. Instead, the probability of observing a given
profile in a subdomain depends also on the form of the same profile elsewhere, i.e. on 
the global structure. This is a consequence of spatial long-range correlations. 

Long-range correlations and nonlocal profile likelihoods suggest that it is possible 
to analyze a functional level inverse problem: even small pieces of information on local, 
spontaneous fluctuations can be translated by an analytical and numerical machinery 
to the language of fluctuations on a global scale. Building such machinery is the topic 
of this letter. Working through a specific application of determining the structure of 
random media using marker particle data, we see how to extract useful information 
about the fluctuation state of a random, macroscopic field from a weak signal. 

The problem of estimating the random media structure is attractive not only 
because of its potential applications, but also because it offers a unique view to spatial 
correlations in non-equilibrium systems and their state estimation in general. First, we 
see that hidden information, in this case a static random environment, can bring about 
an effective particle-particle interaction which results in long-ranged spatial correlations. 
In fact, for the system under study, the correlations are entirely due to hidden variables 
because of a special symmetry in the dynamics. The analysis is then transparent because 
estimation is based on a single type of correlation. Second, these long-range correlations 
turn out to be of a very common form, given by a piecewise linear covariance. Such 
covariance functions can be found in a class of driven diffusive systems (including 
exclusion and KMP processes [5]). The correlations induced by a coupling of
temperature and velocity fluctuations in a Rayleigh-Bénard system also seem to be of the
same form [2]. We discuss a general technique for state estimation of piecewise linearly
correlated systems.

In the following sections, we define the problem of media estimation more precisely 
and describe two complementary solutions. State estimation of non-equilibrium systems 
with time-dependent fluctuation fields is then briefly discussed. 



2. Media estimation 



We study transport of particles through one-dimensional random media, which is 
modeled by random single-particle transition rates between microscopic unit cells. The 
size of the unit cells is determined by the media correlation length, above which the 
transition rates have independent statistics. We assume spatial homogeneity of the 
media in a statistical sense. The rate at which particles move from the microscopic cell
$i$ to cell $i + 1$ and vice versa is a random number $\nu_i$, and these random numbers are
independent and identically distributed. The system is driven out of equilibrium by a
chemical potential difference between two boundary reservoirs. The transition rates $\nu_0$ and $\nu_L$
at the boundaries are set equal to unity.

The stationary states of particle transport in random media were studied in [6] for a
class of particle interactions. In these particle systems, the fugacity profile is a monotone
function, which can be expressed in terms of partial resistances $R_j = 1 + \sum_{i=1}^{j-1} \nu_i^{-1}$ as

$$\phi_j = \phi_- \left(1 - \frac{R_j}{R_{L+1}}\right) + \phi_+ \frac{R_j}{R_{L+1}}, \qquad j = 1, \ldots, L, \tag{2}$$

where $\phi_-$ and $\phi_+$ are the fugacities at the left and right boundary reservoirs, respectively.
The number of particles in a cell $j$ is a function of $\phi_j$ only. Thus the structure of
stationary density profiles on a macroscopic scale depends on the statistics of the
resistances $\nu_i^{-1}$. For finite expected resistances, the fugacity converges to a linear,
deterministic function by the law of large numbers as the number of unit cells diverges.
On the other hand, for $\mathrm{E}\,\nu_i^{-1} = \infty$, percolation effects are important. In terms of a
macroscopic spatial coordinate $x \in [0,1]$, the fugacity converges in distribution to a
random function

$$\phi(x) = \phi_- \left(1 - \frac{R(x)}{R(1)}\right) + \phi_+ \frac{R(x)}{R(1)}, \tag{3}$$

where $R$ is an $\alpha$-stable non-decreasing Lévy process [7]. Here $\alpha + 1 \in (1,2)$ is the
power-law exponent for the tail of the distribution of the resistances $\nu_i^{-1}$. For $\alpha \lesssim 1$,
the fugacity profile has many small jumps, but it typically stays close to the expected
profile

$$\rho(x) := \mathrm{E}\,\phi(x) = \phi_-(1 - x) + \phi_+ x. \tag{4}$$

On the other hand, for $\alpha$ small, the total resistance is dictated by just a few bottlenecks,
which show up as large jumps in the fugacity profile, and as plateaus between the jumps
(see figure 1 below for examples). This is captured by the non-vanishing covariance

$$C(x, y) := \mathrm{E}\,\phi(x)\phi(y) - \rho(x)\rho(y) = (1 - \alpha)(\phi_+ - \phi_-)^2 (x \wedge y)(1 - x \vee y), \tag{5}$$

where $\wedge$ and $\vee$ denote the minimum and maximum of two numbers, respectively.
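The resistance picture behind equations (2) and (5) can be made concrete with a short Python sketch. It is only an illustration under assumptions that the text does not fix: the function names, the Pareto law chosen for the resistances, and the convention that cell $j$ sees the left boundary bond plus all bonds to its left are hypothetical choices.

```python
import random

def fugacity_profile(rates, phi_minus, phi_plus):
    """Fugacity at cells 1..L for bond rates nu_1..nu_{L-1}; boundary rates are 1."""
    partial = [1.0]                               # R_1: only the left boundary bond (rate 1)
    for nu in rates:
        partial.append(partial[-1] + 1.0 / nu)    # R_{j+1} = R_j + 1/nu_j
    total = partial[-1] + 1.0                     # add the right boundary bond (rate 1)
    return [phi_minus + (phi_plus - phi_minus) * r / total for r in partial]

def covariance(x, y, alpha, phi_minus, phi_plus):
    """Piecewise linear covariance of equation (5)."""
    return (1.0 - alpha) * (phi_plus - phi_minus) ** 2 * min(x, y) * (1.0 - max(x, y))

def heavy_tailed_rates(n, alpha, rng):
    """Rates whose inverses have a Pareto tail: P(1/nu > r) = r**(-alpha) for r >= 1."""
    # Assumption: a pure Pareto law stands in for the unspecified resistance distribution.
    return [(1.0 - rng.random()) ** (1.0 / alpha) for _ in range(n)]

rng = random.Random(0)
profile = fugacity_profile(heavy_tailed_rates(999, 0.5, rng), 0.0, 100.0)
```

With all rates equal to one the profile is linear, while heavy-tailed resistances concentrate the rise of the profile on a few bonds, which is the bottleneck effect described above.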

We consider the limit of weak particle-particle interactions, in which case fugacity
equals the particle density |Bj. By equation (5), the correlations in the density profile
are non-exponential, and mediated by the disordered media. They also vanish in
equilibrium, i.e. when $\phi_- = \phi_+$. Thus static hidden variables in a non-equilibrium
system can result in an effective interaction, and in long-range correlations.

To obtain an image of the underlying media, we dilute the boundary reservoirs
with marker particles. Only those can be observed in an experiment. For simplicity,
the fraction of markers is taken to be the same in both reservoirs, but still $\phi_- \neq \phi_+$.
For a very dilute marker, the numbers of observed marker particles in disjoint sets
of macroscopic size within the media are asymptotically independent and Poisson
distributed, with the intensity of the distribution proportional to the integral of the particle
density‡:

$$\mathrm{P}\left(\bigcap_{i=1}^{n} \{N(A_i) = m_i\} \,\Big|\, \phi\right) = \prod_{i=1}^{n} \frac{\Lambda(A_i)^{m_i}}{m_i!}\, e^{-\Lambda(A_i)}, \qquad \Lambda(A) = c \int_A \phi(x)\, dx, \tag{6}$$

where $A_i \subset [0, 1]$ are disjoint macroscopic sets. In other words, a steady state snapshot
of marker particles is a Cox (or doubly stochastic Poisson point) process directed by a
random measure $\Lambda$ with density $c\phi$ [El E]. In the following, we take $c = 1$.
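A snapshot of such a Cox process can be simulated in two stages: first fix a realization of the density, then draw a Poisson point process with that intensity. The sketch below uses thinning of a homogeneous process for the second stage. The step-shaped density, the function names and the bound `phi_max` are assumptions made for illustration only.

```python
import random

def sample_markers(phi, phi_max, rng):
    """Poisson point process on [0, 1) with intensity phi(x) <= phi_max, by thinning."""
    points = []
    x = rng.expovariate(phi_max)              # first candidate of a rate-phi_max process
    while x < 1.0:
        if rng.random() * phi_max < phi(x):   # keep with probability phi(x) / phi_max
            points.append(x)
        x += rng.expovariate(phi_max)
    return points

def step_profile(jumps, phi_minus, phi_plus):
    """Non-decreasing step density; jumps is a list of (position, weight) pairs."""
    total = sum(w for _, w in jumps)
    def phi(x):
        passed = sum(w for pos, w in jumps if pos <= x)
        return phi_minus + (phi_plus - phi_minus) * passed / total
    return phi

rng = random.Random(1)
phi = step_profile([(0.2, 1.0), (0.7, 3.0)], 0.0, 100.0)   # hypothetical bottlenecks
markers = sample_markers(phi, 100.0, rng)
```

Because the density vanishes to the left of the first bottleneck when $\phi_- = 0$, no markers can appear there, which is why sparse data near the left boundary carry little information in that case.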

Given the relation between the partial resistances and the particle density,
we see that

$$r(a, b) = r(a, b; \phi) = \frac{\phi(b) - \phi(a)}{\phi_+ - \phi_-} \tag{7}$$

is the resistance of a macroscopic interval $(a, b]$ relative to the total resistance of the
media (which can be determined from a flow experiment). In order to infer this quantity
for any $a$ and $b$ from a snapshot of the marker particles, i.e. the Cox process data at
hand, we need a reliable estimator for the density profile $\phi$. Bayesian analysis shows that
the minimum mean square error estimator (MMSE estimator) of $\phi$ given markers at
locations $(x_1, \ldots, x_n)$ is [H [10]

$$\hat\phi(x) = \frac{\mathrm{E}\left[\phi(x) \prod_{i=1}^{n} \phi(x_i)\, e^{-\int_0^1 \phi(y)\, dy}\right]}{\mathrm{E}\left[\prod_{i=1}^{n} \phi(x_i)\, e^{-\int_0^1 \phi(y)\, dy}\right]}. \tag{8}$$

An obvious choice for the estimator of the relative resistance would then be

$$\hat r(a, b) = r(a, b; \hat\phi). \tag{9}$$

However, the expectations in expression (8) are not analytically tractable to the author's
knowledge (even the single-point moments $\mathrm{E}\,\phi(x)^n$ have quite complicated formulae [6]),
and therefore good approximate estimators are needed.
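The plug-in character of the resistance estimator is easy to see in code: any estimate of the density profile is simply inserted into the definition of the relative resistance. The following minimal sketch uses two made-up profiles, and the names `linear` and `wiggly` are hypothetical.

```python
def relative_resistance(phi, a, b, phi_minus, phi_plus):
    """Resistance of (a, b] relative to the total, for a density profile phi."""
    return (phi(b) - phi(a)) / (phi_plus - phi_minus)

def linear(x):
    """Linear profile: every interval gets a resistance equal to its length."""
    return 10.0 + 20.0 * x

def wiggly(x):
    """Made-up non-monotone profile with a dip between 0.4 and 0.6."""
    return 10.0 + 20.0 * x - 8.0 * (0.4 < x < 0.6)

r_linear = relative_resistance(linear, 0.25, 0.75, 10.0, 30.0)
r_wiggly = relative_resistance(wiggly, 0.3, 0.5, 10.0, 30.0)
```

The second call returns a negative number, so a plug-in resistance is guaranteed to be non-negative only if the density estimate itself is non-decreasing.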

‡ Mathematically, this is obtained by letting the fraction of markers in each reservoir scale in inverse
proportion to the number of unit cells.
2.1. Linear estimation

The MMSE estimator (8) is in general a nonlinear function of the observations. In
particular, it lacks additivity under inclusion of new information. Nonlinear estimation
can be computationally costly because the estimator has to be recalculated completely
each time new data become available. We next consider linear estimators of the form

$$\hat\phi_L(x) = \int_{[0,1]} k(x, y)\, dN(y), \tag{10}$$

where $N$ is the counting measure for marker particles in a snapshot (i.e. a Poisson
random measure with random directing intensity measure $\Lambda(A) = \int_A \phi(x)\, dx$). In
particular, $\int_{[0,1]} k(x, y)\, dN(y) = \sum_{i=1}^{n} k(x, x_i)$ for $n$ observed particles at locations $x_i$.
The deterministic kernel $k$ translates each observed particle to the language of global
density fluctuations separately.

The original construction of an MMSE linear estimator for a general Cox process with
a directing density is due to Grandell [11]. An uncomplicated derivation uses the fact
that the space in which the estimator errors are measured, the space of square integrable
functions $L^2(\mathrm{P})$, is a Hilbert space. The trick is that the MMSE linear estimator is a
projection onto the linear subspace spanned by functionals of the form $\int_{[0,1]} f(y)\, dN(y)$
[12]. Thus the error function $\phi - \hat\phi_L$ has to be orthogonal to every such functional:

$$\mathrm{E}\left\{\left[\phi(x) - \hat\phi_L(x)\right] \int_{[0,1]} f(y)\, dN(y)\right\} = 0. \tag{11}$$

A straightforward calculation§ shows that the orthogonality relation is solved by

$$\hat\phi_L(x) = \rho(x) + \int_{[0,1]} K(x, y)\left[dN(y) - \rho(y)\, dy\right] \tag{12}$$

if the kernel $K$ satisfies the integral equation

$$K(x, y)\rho(y) + \int_{[0,1]} K(x, z)\, C(z, y)\, dz = C(x, y). \tag{13}$$
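The calculation can be spelled out as follows. This is a reconstruction from the stated moment identity for $dN$, not a quotation of the original derivation. Using $\mathrm{E}\,[\phi(x)\, dN(y)] = \mathrm{E}\,[\phi(x)\phi(y)]\, dy$ and $\mathrm{E}\,[dN(z)\, dN(y)] = \left(\delta(z - y)\rho(z) + \mathrm{E}\,\phi(z)\phi(y)\right) dz\, dy$, and inserting the form (12) into the orthogonality relation (11), one finds

$$\begin{aligned}
\mathrm{E}\left\{\left[\phi(x) - \hat\phi_L(x)\right] \int f\, dN\right\}
&= \int f(y) \Big\{ \mathrm{E}\,\phi(x)\phi(y) - \rho(x)\rho(y) \\
&\qquad - \int K(x, z)\left[\delta(z - y)\rho(z) + \mathrm{E}\,\phi(z)\phi(y) - \rho(z)\rho(y)\right] dz \Big\}\, dy \\
&= \int f(y) \Big\{ C(x, y) - K(x, y)\rho(y) - \int K(x, z)\, C(z, y)\, dz \Big\}\, dy .
\end{aligned}$$

The right-hand side vanishes for every test function $f$ exactly when the expression in braces is zero for all $y$, which is the integral equation (13).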



A remarkable feature of the result (12), (13) is that only the mean $\rho(x)$ and the covariance
$C(x, y)$ of the directing density are used in the construction. The information on higher
moments and correlations is neglected. In this sense, linear estimation is a mean field
approximation reminiscent of Gaussian approximations.
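Since only the mean and the covariance enter, the kernel can also be approximated numerically for any given pair of these functions. The sketch below discretizes the integral equation (13) with a midpoint rule and solves the resulting linear system by Gaussian elimination. This numerical route is swapped in for illustration and is not the method of the text. The grid size, the parameter values and all function names are assumptions.

```python
def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    a = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - s) / a[r][r]
    return x

def linear_kernel(x, grid, rho, cov):
    """Approximate K(x, y_j) at the midpoint nodes y_j from the integral equation (13)."""
    w = 1.0 / len(grid)                        # midpoint-rule weight
    matrix = [[w * cov(z, y) + (rho(y) if i == j else 0.0)
               for i, z in enumerate(grid)] for j, y in enumerate(grid)]
    return solve_linear(matrix, [cov(x, y) for y in grid])

def linear_estimate(x, markers, grid, rho, cov):
    """Equation (12): rho(x) + sum_i K(x, x_i) - int K(x, y) rho(y) dy."""
    kernel = linear_kernel(x, grid, rho, cov)
    w = 1.0 / len(grid)
    def k_at(y):                               # nearest-node lookup of K(x, y)
        return kernel[min(int(y / w), len(grid) - 1)]
    drift = w * sum(k * rho(y) for k, y in zip(kernel, grid))
    return rho(x) + sum(k_at(m) for m in markers) - drift

# Illustrative parameters (assumed, not taken from the figures of the text).
alpha, phi_minus, phi_plus = 0.5, 5.0, 30.0
rho = lambda y: phi_minus + (phi_plus - phi_minus) * y
cov = lambda x, y: (1 - alpha) * (phi_plus - phi_minus) ** 2 * min(x, y) * (1 - max(x, y))
grid = [(j + 0.5) / 40 for j in range(40)]
kernel = linear_kernel(0.25, grid, rho, cov)
estimate = linear_estimate(0.25, [0.1, 0.3, 0.32, 0.8], grid, rho, cov)
```

If the covariance is set to zero the kernel vanishes and the estimate falls back to the mean $\rho(x)$, which illustrates that the data influence the estimate only through the correlations.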

For media estimation with the mean and covariance of the particle density given
by equations (4) and (5), equation (13) for the kernel becomes a coupled pair of integral
equations because the covariance function $C(x, y)$ is piecewise defined. However, due
to the piecewise linearity of $C(x, y)$, differentiation of these equations twice with respect to
$y$ leads to two decoupled second order differential equations. Consequently, the linear
media estimation has an explicit solution:

$$K(x, y) = \frac{C(x, x)\, \kappa(x, y)}{\kappa(x, x)\rho(x) + \int_{[0,1]} \kappa(x, z)\, C(z, x)\, dz}, \qquad \kappa(x, y) = g_-(x \wedge y)\, g_+(x \vee y), \tag{14}$$

§ Observe that $\mathrm{E}\, dN(x)\, dN(y) = \left(\delta(x - y)\, \mathrm{E}\,\phi(x) + \mathrm{E}\,\phi(x)\phi(y)\right) dx\, dy$.




Figure 1. Linear (dashed) and maximum likelihood (dash-dotted) estimators for
simulated marker density profiles (solid line) at (a) $\alpha = 0.95$, $\phi_+ = 30$, and (b)
$\alpha = 0.5$, $\phi_+ = 100$. In both cases, $\phi_- = 0$. The black circles on shaded background
show the observed marker particles. Insets: the kernel functions $K(x, 0.25)$ (solid line)
and $K(x, 0.75)$ (dashed).



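where the auxiliary functions $g_+$ and $g_-$ are given in terms of the modified Bessel functions
$I_1$ and $K_1$ (see [13] for definition and properties) as

$$g_\pm(x) = \frac{1}{\sqrt{\rho(x)}} \left\{ I_1\left(2\sqrt{(1-\alpha)\rho(x)}\right) - \frac{I_1\left(2\sqrt{(1-\alpha)\phi_\pm}\right)}{K_1\left(2\sqrt{(1-\alpha)\phi_\pm}\right)}\, K_1\left(2\sqrt{(1-\alpha)\rho(x)}\right) \right\}. \tag{15}$$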
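The reduction from the integral equation (13) to Bessel functions can be sketched in a few lines. This is a reconstruction from the mean (4) and the covariance (5), not a derivation quoted from the source. Writing $A = (1 - \alpha)(\phi_+ - \phi_-)^2$, the covariance has a kink on the diagonal, so that differentiating (13) twice with respect to $y$ gives

$$\begin{aligned}
\partial_y^2\, C(z, y) &= -A\, \delta(z - y), \\
\partial_y^2 \left[K(x, y)\rho(y)\right] - A\, K(x, y) &= -A\, \delta(x - y).
\end{aligned}$$

For $y \neq x$, the substitution $u = K\rho$ and $s = \rho(y)$, with $ds/dy = \phi_+ - \phi_-$, turns this into

$$\frac{d^2 u}{ds^2} = \frac{(1 - \alpha)\, u}{s},$$

whose independent solutions are $\sqrt{s}\, I_1\left(2\sqrt{(1 - \alpha)s}\right)$ and $\sqrt{s}\, K_1\left(2\sqrt{(1 - \alpha)s}\right)$. Dividing by $\rho$ explains the prefactor $1/\sqrt{\rho(x)}$ of the auxiliary functions. Since $C$ vanishes whenever one of its arguments is 0 or 1, equation (13) also forces $K(x, 0)\rho(0) = K(x, 1)\rho(1) = 0$, which selects one combination of the two solutions on each side of the diagonal.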



Figure 1 shows the linear estimator for density profiles obtained from simulations
of $\alpha$-stable processes for two disorder strengths $\alpha$. In both cases, the insets show the
kernel $K$, i.e. the effect of a single particle. As the true density profiles get closer to
the expected profile as $\alpha \to 1$, the effect of a single observed particle on the estimator
also gets smaller, and the function $K(x, y)$ thus gets less peaked around $y$. This is due to
the unbiasedness of the linear estimator, $\mathrm{E}\,\hat\phi_L(x) = \rho(x)$. Also remarkable is the asymmetry
of the peaks, which is a consequence of the boundary conditions imposed on the density
profiles.

Figure 1(b) shows that the linear estimator yields non-monotone density profiles,
which lead to negative resistance estimates $\hat r_L(a, b) = r(a, b; \hat\phi_L)$.
The problem is not serious at large marker densities: for $0 < \phi_- := \sigma_- \gamma < \sigma_+ \gamma =: \phi_+$,

$$\sqrt{\gamma}\; \mathrm{E}\left[\left(\hat r_L(a, b) - r(a, b)\right)^2\right] \to \frac{\sqrt{1 - \alpha}\left(\sqrt{\sigma_+ b + \sigma_-(1 - b)} + \sqrt{\sigma_+ a + \sigma_-(1 - a)}\right)}{2(\sigma_+ - \sigma_-)} \tag{16}$$

as $\gamma \to \infty$, in that the estimation error vanishes as the inverse square root of the marker
density. The proof is based on the previous Hilbert space techniques, in combination
with asymptotic formulae for the modified Bessel functions.

The constrained optimization problem of finding a linear monotone estimator
seems much more difficult to solve than the unconstrained one. However, the negative
resistances can also be avoided by truncating the negative values of the linear resistance
estimator $\hat r_L$. This also reduces the estimation error. Alternatively, one can inspect the
increments of the best monotone approximation to $\hat\phi_L$ in the uniform topology [14],

$$\hat\phi_{\mathrm{mon},L}(x) = \frac{1}{2}\left(\sup_{0 \le y \le x} \xi(y) + \inf_{x \le z \le 1} \xi(z)\right), \qquad \xi(x) = \phi_- \vee \hat\phi_L(x) \wedge \phi_+. \tag{17}$$
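On a grid, the monotone approximation amounts to averaging a running maximum taken from the left with a running minimum taken from the right, after clamping the estimate to the reservoir values. A minimal sketch follows, with made-up sample values and hypothetical function names.

```python
def clamp_profile(values, phi_minus, phi_plus):
    """Clamp every value of the estimate to the interval [phi_minus, phi_plus]."""
    return [min(max(v, phi_minus), phi_plus) for v in values]

def monotone_approximation(values):
    """Average of the running maximum from the left and the running minimum from the right."""
    running_max, best = [], float("-inf")
    for v in values:
        best = max(best, v)
        running_max.append(best)
    running_min, best = [], float("inf")
    for v in reversed(values):
        best = min(best, v)
        running_min.append(best)
    running_min.reverse()
    return [0.5 * (hi + lo) for hi, lo in zip(running_max, running_min)]

estimate = [2.0, 5.0, 3.0, 4.0, 9.0, 7.0, 12.0]          # made-up non-monotone estimate
monotone = monotone_approximation(clamp_profile(estimate, 0.0, 10.0))
```

Both running sequences are non-decreasing, so their average is non-decreasing as well, and an estimate that is already monotone and within the reservoir values is returned unchanged.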

Next we introduce a monotone estimator that complements the linear one.

2.2. Maximum likelihood estimator

In order to find an inherently monotone estimator, we neglect the information that we
have on the statistics of the density profile increments and look for a non-decreasing
function that maximizes the likelihood function

$$L(\phi, (x_i)) = \prod_{i=1}^{n} \phi(x_i)\, \exp\left(-\int_{[0,1]} \phi(x)\, dx\right) \tag{18}$$

under the assumption that $\phi_- \le \phi(x_1) \le \ldots \le \phi(x_n) \le \phi_+$. The problem has been
discussed for non-negative functions by numerous authors (see e.g. [15, 16]). However,
a heuristic construction is hard to find in the literature, and since we have to include
the extra condition $\phi_- \le \phi(x_1)$, we provide a derivation.



Suppose one fixes the values $\psi_i = \phi(x_i)$ in expression (18). Then the likelihood
maximizing function is the one that minimizes the integral in the same expression,
which is nothing but the lower step function (the subscript ML stands for maximum
likelihood)

$$\hat\phi_{ML}(x) = \begin{cases} \phi_- & \text{for } 0 = x_0 \le x < x_1, \\ \phi_- \vee \psi_i \wedge \phi_+ & \text{for } x_i \le x < x_{i+1},\ i = 1, \ldots, n, \\ \phi_+ & \text{for } x = x_{n+1} = 1. \end{cases} \tag{19}$$

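The true problem is to find the optimal values for $\psi_i$. By inspecting the log-likelihood

$$\log L(\phi, (x_i)) = -\log \phi_- + \sum_{i=0}^{n} \left[\log \psi_i - \psi_i \Delta x_i\right], \tag{20}$$

where $\Delta x_i = x_{i+1} - x_i$ and $\psi_0 = \phi_-$, we see that if $\Delta x_{i-1} \ge \Delta x_i$ for every $i = 1, \ldots, n$, the optimal
choices are $\psi_i = 1/\Delta x_i$.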
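The stationarity condition behind this choice can be written out explicitly. Each term of the sum in (20) depends on a single $\psi_i$, so, ignoring the monotonicity and boundary constraints for the moment,

$$\frac{\partial}{\partial \psi_i}\left[\log \psi_i - \psi_i \Delta x_i\right] = \frac{1}{\psi_i} - \Delta x_i = 0 \quad \Longrightarrow \quad \psi_i = \frac{1}{\Delta x_i},$$

and the second derivative $-1/\psi_i^2 < 0$ confirms a maximum. The resulting values are non-decreasing in $i$ exactly when the gaps $\Delta x_i$ are non-increasing, which is the condition on the gap sequence stated above.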

Suppose that the required monotonicity of the gap sequence $(\Delta x_i)$ is broken at
particle $i$, in that $\Delta x_{i-1} < \Delta x_i$. Clearly one needs to level the values $\psi_{i-1}$ and $\psi_i$ in
such a way that the sequence is monotone again: $\psi_{i-1,i} = \psi_i = \psi_{i-1}$. This common value
appears as $\log \psi_{i-1,i}^2 - \psi_{i-1,i}(\Delta x_{i-1} + \Delta x_i)$ in the log-likelihood, so the new optimizer
reads $\psi_{i-1,i} = 2/(\Delta x_{i-1} + \Delta x_i)$. If after that $\psi_{i-2} \le \psi_{i-1,i} \le \psi_{i+1}$ is violated, one
makes the same adjustment for a necessary number of values to the left and to the right
of the particle at $x_i$. The log-likelihood involving adjustments of $\psi_j$ and $\psi_k$ with
$j \le i \le k$ has a term $\log \psi_i^{k+1-j} - \psi_i(\Delta x_j + \ldots + \Delta x_k)$, which leads to an optimal
density $\psi_i = (k + 1 - j)/(x_{k+1} - x_j)$ on that plateau. The condition that the adjustment
process ends at index $j$ on the left and at $k$ on the right is

$$\frac{1}{\Delta x_{j-1}} \le \frac{k + 1 - j}{x_{k+1} - x_j} \le \frac{1}{\Delta x_{k+1}} \iff \frac{k + 1 - (j - 1)}{x_{k+1} - x_{j-1}} \le \frac{k + 1 - j}{x_{k+1} - x_j} \le \frac{k + 2 - j}{x_{k+2} - x_j}, \tag{21}$$

and hence

$$\psi_i = \max_{j \le i}\, \min_{k \ge i}\, \frac{k + 1 - j}{x_{k+1} - x_j}. \tag{22}$$
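The levelling procedure described above is a pool-adjacent-violators scheme, and it can be checked against a direct evaluation of the max-min formula. The sketch below implements both. It assumes distinct marker positions in $[0, 1)$, index ranges $1 \le j \le i \le k \le n$ with $x_{n+1} = 1$, and clamping to the reservoir values as the last step. The function names are hypothetical.

```python
def ml_plateaus(markers, phi_minus, phi_plus):
    """Non-decreasing ML values psi_1..psi_n by pooling adjacent violators."""
    xs = sorted(markers) + [1.0]                  # x_1 < ... < x_n, x_{n+1} = 1
    blocks = []                                   # each block: [particle count, total gap length]
    for i in range(len(xs) - 1):
        blocks.append([1, xs[i + 1] - xs[i]])
        # Pool the last two blocks while the density count / length would decrease.
        while len(blocks) > 1 and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            count, length = blocks.pop()
            blocks[-1][0] += count
            blocks[-1][1] += length
    psi = []
    for count, length in blocks:
        level = min(max(count / length, phi_minus), phi_plus)   # clamp to reservoir values
        psi.extend([level] * count)
    return psi

def ml_max_min(markers, phi_minus, phi_plus):
    """Direct evaluation of the max-min formula, followed by the same clamping."""
    xs = sorted(markers) + [1.0]
    n = len(markers)
    psi = []
    for i in range(n):                            # zero-based particle index
        level = max(min((k + 1 - j) / (xs[k + 1] - xs[j]) for k in range(i, n))
                    for j in range(i + 1))
        psi.append(min(max(level, phi_minus), phi_plus))
    return psi

plateaus = ml_plateaus([0.1, 0.5, 0.6, 0.9], 0.0, 100.0)
```

For the four markers in the example the gaps are roughly 0.4, 0.1, 0.3 and 0.1, so the second and third particles violate the ordering and are pooled into one plateau of density about 5, between plateaus of density about 2.5 and 10. The pooling version needs a single pass over the sorted markers, whereas the direct formula costs on the order of $n^3$ operations.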



The maximum likelihood (ML) estimator, given by (19) and (22), is plotted in figure
1. The estimated density profiles consist of discontinuities at marker particle positions
and of plateaus in between them. This makes the ML estimator flexible as compared to
the linear estimator and therefore particularly suited for estimation at small values of
$\alpha$. For the same reason, the corresponding relative resistance measure does not have a
continuous density as in linear estimation, but a spike train just like the original media.

The lack of information on the disorder strength $\alpha$, however, poses some problems
at low marker densities. Especially for $\alpha$ close to one, the estimated profiles are typically
far away from the expected profile $\rho(x)$, and the ML estimator is outperformed by the
linear estimator (see figure 1(a)).

Contrary to the linear estimator for the density, the ML estimator is biased. It
easily underestimates the density at very low marker densities because in the absence of
marker particle observations, $\hat\phi_{ML}(x) = \phi_-$ for $x < 1$.



3. Discussion 



We have shown that long-range correlations present in non-equilibrium systems
can be used to extract information on an underlying fluctuation field even from very 
limited information. We applied the methods from statistical state estimation theory to 
particle transport in disordered media and estimated the media structure from a dilute 
marker signal. Linear estimation turned out to deliver unbiased but not necessarily 
positive estimates for the effective resistances of the media. This could be remedied by 
truncation or by monotonizing the density estimator. On the other hand, a nonlinear, 
maximum likelihood estimator yielded positive resistances but has an inherent bias, 
which is severe only at very small marker densities. Although quantitative results about 
the asymptotic error of the estimates could be derived in the linear case only, numerical 
simulations indicate fast convergence of the ML estimator. 

The state-estimation of a non-equilibrium system from marker data was based on 
the assumption that the marker particles in a stationary flow are described by a Cox 
process. Furthermore, the directing density of the process was taken to be of the same 
form as the total particle density. This assumption can be broken in systems with steric
effects, such as particle-particle exclusion, in which case one has to be able to tell from
which reservoir the particles originated.

In driven systems without quenched disorder, the long-range correlations are often 
weak, in that the amplitude of the correlations decays as a function of the system size. It 
is only the macroscopic correlations of a suitably rescaled fluctuation field that persist in 
the scaling limit [T71 H]. This is in contrast to the non-vanishing correlations in the density
profiles in random media. Probably a stronger marker signal is required for a successful 
estimation in case of weak correlations. 
The asymptotic correlations in a class of driven diffusive systems are of the same
form as in the media estimation problem, apart from the system size dependent 
amplitude [5]. Thus one can apply the linear estimation machinery to state estimation of 
large but finite systems in this class. Another approach is through microscopic solutions 
and generating functions such as used in the analysis of large deviations [I] . Pursuing 
these themes further will give information on the feasibility of state estimation for 
general non-equilibrium systems. 